A continuous VQ clustering algorithm for realtime speech recognition
نویسندگان
چکیده
This paper presents a continuous VQ clustering (CVQC) algorithm for realtime speech recognition, which incorporates the temporal information of speech into both training and recognition processes. In comparison with the conventional DTW and VQ methods, this new algorithm delivers faster training and recognition speed and smaller codebook size while still retains merits of both. Realtime implementation is emphasized in the design of sophisticated algorithms. And a custom available voice controlled computer command input system based on CVQC is also introduced.
منابع مشابه
Clustering beyond phoneme contexts for speech recognition
The clustering of using decision trees is generalized to take into account high-level knowledge sources to better model the co-articulation e ects in large vocabulary continuous speech recognition. VQ models are used to reduce the computational cost in constructing decision trees. The search algorithm is designed such that it can provide a general type of information for decision trees without ...
متن کاملImproved Bayesian Training for Context-Dependent Modeling in Continuous Persian Speech Recognition
Context-dependent modeling is a widely used technique for better phone modeling in continuous speech recognition. While different types of context-dependent models have been used, triphones have been known as the most effective ones. In this paper, a Maximum a Posteriori (MAP) estimation approach has been used to estimate the parameters of the untied triphone model set used in data-driven clust...
متن کاملSpeaker change detection and speaker clustering using VQ distortion for broadcast news speech recognition
This paper addresses the problem of the detection of speaker changes and clustering speakers when no information is available regarding speaker classes or even the total number of classes. We assume that no previous information on speakers is available (no speaker model, no training phase) and that people do not speak simultaneously. The aim is to apply speaker grouping information to speaker a...
متن کاملEvaluation of ETSI advanced DSR front-end and bias removal method on the Japanese newspaper article sentences speech corpus
In October 2002, European Telecommunications Standards Institute (ETSI) recommended a standard Distributed Speech Recognition (DSR) advanced front-end, ETSI ES202 050 version 1.1.1 (ES202). Many studies use this front-end in noise environments on several languages on connected digit recognition tasks. However, we have not seen the reports of large vocabulary continuous speech recognition using ...
متن کاملA New Vector Quantization Front-End Process for Discrete HMM Speech Recognition System
The paper presents a complete discrete statistical framework, based on a novel vector quantization (VQ) front-end process. This new VQ approach performs an optimal distribution of VQ codebook components on HMM states. This technique that we named the distributed vector quantization (DVQ) of hidden Markov models, succeeds in unifying acoustic micro-structure and phonetic macro-structure, when th...
متن کامل